IFDO/IASSIST INTERNATIONAL CONFERENCE
AMSTERDAM  MAY  20-24,  1985 

THEME:  PUBLIC  ACCESS  TO  PUBLIC  DATA 


Public data like census data, administrative micro-data and survey data offer many opportunities for research
in the social sciences and related disciplines. In the past five years new developments in computer software and
hardware have accelerated the technical availability of data material. Computer networks, 'intelligent' terminals,
micro-computers, on-line databases, sophisticated software packages etc. make it technically possible to access
an enormous amount of data for scientific research.

With these developments new problems emerged concerning access to the data. Existing regulations for the
control of the flow of data proved to be inadequate. Researchers tried to find new ways to make optimal use of the
technical possibilities to get access to public data.

The next IFDO/IASSIST conference in Amsterdam will be an excellent opportunity for people who are dealing
with public data to present new developments in this area, and to discuss the related problems. The meetings
will include the discussion of papers on a variety of topics of interest to social scientists, data archivists, librarians,
research administrators, government records managers and users of data banks.

CONFERENCE  FORMAT 

The conference will include plenary sessions and concurrent sessions with presentations and demonstrations. One
day  of  the  conference  will  be  devoted  to  workshops  on  specific  topics  like  census  software  packages,  statistical 
programs  for  micro-computers  and  international  data.  Conference  language:  English. 

The registration fee will be Dfl. 300 (app. US$ 100). The fee includes conference activities, workshops, a book
containing the outlines of the presented papers, coffee- and tea-breaks, a reception and a number of meals. The
meetings are planned in the conference rooms of the Grand Hotel Krasnapolsky, in the very center of Amsterdam.

CALL  FOR  PAPERS 

Papers are being solicited on the various aspects of the theme of Public access to public data as described above.
Abstracts of papers for the conference should be submitted to the conference program committee before December
1, 1984. Abstracts should be typed in English, with a maximum of 500 words.

CONFERENCE  ORGANIZATION 

The International Federation of Data Organizations for the Social Sciences (IFDO) and the International Association
for Social Science Information Service and Technology (IASSIST) cooperate in the organization of the Amsterdam
conference. The conference will be hosted by the Steinmetz Archives (the Dutch data archive), which is a
department of the Social Science Information and Documentation Center.

For  additional  information  write  to: 

STEINMETZ  ARCHIVES 
IFDO/IASSIST CONFERENCE

Herengracht  410-412 

1017  BX  Amsterdam 

THE  NETHERLANDS 

Telephone  (20)  225061 

Please indicate whether you plan to submit an abstract!
Since 1984 marked the 10th anniversary of IASSIST as an organization,
I felt it would be useful to focus an issue of the QUARTERLY on the
present concerns of social science data archives. In the articles that
follow, Ann Gerken, Bliss Siman, and Gert Lewis discuss the
organization of an archive in different settings. Patricia Reslock and
Jacqueline McGee have prepared information on utilities and techniques
for handling machine-readable information. An overall picture of
data management and the changing role of archives is described in the
articles by Ilona Einowski and Jacqueline McGee.

These articles were prepared from papers presented and workshops
conducted at the last two IASSIST annual conferences. The sessions
focused on the management of archives and data, with representatives
from a variety of settings who reported on the functions and
organization of their particular facilities. I hope that the articles can
give readers at least the flavor of the lively discussion and
interaction that was generated in these gatherings.

Finally,  many  many  thanks  go  to  Ilona  Einowski  and  Bliss  Siman  who 
were  the  co-editors  for  this  issue  of  the  QUARTERLY.  They  did  much  of 
the tedious and time-consuming work of gathering and preparing the
articles.  It  is  due  to  their  efforts  that  we  were  able  to  bring 
together  the  experiences  of  other  archivists  and  present  them  here. 
Jacqueline McGee
The  Rand  Corporation 


The  comments  presented  by  me  today  will  necessarily  reflect 
my  view  of  the  research  community  from  my  position  in  the  Rand 
Computation  Center  Data  Facility.  It  is  hoped  some  of  my 
remarks  will  also  reflect  a  more  general  situation. 

The  topics  to  be  discussed  in  this  paper  are: 

1.  The  impact  of  government  policy  on  the  establishment 
and  maintenance  of  data  archives; 

2.  The  changing  role  of  data  archives;  and 

3.  The  changing  requirements  of  research  clientele. 

If my remarks are to be pertinent for this session I feel it
is necessary to provide for you a brief historical sketch of the
Rand Corporation. I do so not only to clarify my position on this
subject, but also to reflect on the past in order to describe more
clearly the present and to contemplate the future.

Rand  History 

The Rand Corporation was organized in 1946 as the RAND Project,
when the then Army Air Forces (AAF) awarded the Douglas
Aircraft Company a contract whose purpose was to provide
scientific advice and recommendations to the AAF. Two years
later, to prevent what was seen by some as a possible conflict
of interest, the RAND Project became the Rand Corporation with
private funds from the Ford Foundation. At first, work at Rand
concentrated mainly on national security issues.
However, some research did address domestic policy issues,
especially in the areas of transportation, water supply, mental
health and local government needs.

Later, in 1969, at the request of Mayor Lindsay of New York
City, the New York Rand Institute was formally established.
The work of the Institute was intended to aid the city in
identifying opportunities and problems and to assist with the
decision-making process. The Institute's work included the active
involvement of city officials. It was here that many of the
domestic research projects later undertaken at Rand had their
beginnings.

Today the Rand Corporation is a private nonprofit research
institution. Also housed at Rand is the Rand Graduate Institute.
The Institute offers a graduate program leading to a
doctoral degree in policy analysis. Students participate in
formal academic study and also receive on-the-job training in
applied research while participating in Rand projects. Research
projects are conducted in three major divisions--Project Air
Force, National Security, and Domestic Research--and the Civil
Justice Institute. The first of these divisions, Project Air
Force, is still in operation after 37 years. The largest
single activity at Rand, Project Air Force contributed approximately 37 percent
of the total revenue in Fiscal Year 1983. The second division, National
Security, contributed 28 percent of the fiscal year revenue. The third division,
Domestic Research, contributed 29 percent and the Civil Justice Institute
contributed 5 percent.

Research staff for all divisions at Rand are housed in six research
departments: Behavioral Science, Economics, Engineering and Applied Science,
Information Science, Political Science and System Science. Supporting departments
include research Libraries, Publications and Computer Services.

Research  projects  draw  personnel  from  all  of  the  six  departments.  Policy 
analysis  and  research  at  Rand  has  been  described  as  "interdisciplinary." 
Interdisciplinary  research  requires  that  members  of  the  research  projects 
interact  and  integrate  their  separate  disciplines. 

Data  Facility 

The  Data  Facility  was  initiated  at  Rand  about  ten  years  ago.  Originally  the 
Facility  was  envisioned  as  a  much  larger  and  more  costly  operation  which  would 
have  included  data  management  for  projects,  but  a  decision  was  made  not  to  invest 
too  heavily  in  the  management  of  data  and  an  attempt  to  identify  possible  data 
files  for  archiving  became  the  Facility's  primary  focus  at  that  time. 

The Data Facility (machine readable data file--MRDF--archive) resides in the
Computer Services Department. The Facility staff assists projects in identifying
and acquiring data files from both Rand and non-Rand sources, and serves as a
central clearing house for the acquisition, dissemination, control and storage
of MRDF. The Data Facility staff attempts to acquire and archive data files
with a high probability of future use or files which are considered generally
useful for a wide variety of research applications.

The  Data  Facility  staff  assists  researchers  with  proposals  by  supplying 
information  about  the  availability  of  data,  in  the  archive  or  elsewhere,  the 
location  or  existence  of  variables  or  data  and  statistical  reference  sources. 
Project  programmers  and  researchers  using  archived  data  frequently  request 
assistance  with  the  use  of  a  particular  data  file,  clarification  of  codes,  or 
additional  reference  sources.  Our  facility  also  maintains  separate  storage 
for  software  products,  Rand-generated  programs  and  models,  user  manuals  and 
references. 

During the Facility's developing years the large government-sponsored social
experiments  and  demonstration  projects  were  underway  at  Rand.  The  Housing 
Assistance  Supply  Experiment,  the  Health  Insurance  Study,  and  the  National 
Preventive  Dentistry  Demonstration  Project  were  a  few  of  the  large  studies 
undertaken  at  Rand  and  in  some  cases  are  only  now  coming  to  a  close.  I  suspect 
that  many  other  similar  large  data  collection  projects  elsewhere  led  to  the 
establishment  of  some  of  today's  archives. 

Government  Influence 

To  most  of  us  the  impact  of  government  policies  on  the  establishment  and 
maintenance  of  data  archives  is  fairly  obvious.  A  careful  inspection  of  many 
archive  inventories  would  indicate  that  much  of  the  data  collected  and  utilized 
in  the  research  process  begins  with  a  government  agency.   In  fact,  it  would  be 
difficult  to  locate  many  data  files  that  were  created  without  at  least  some 
government  funds. 

As an example, the Rand Data Facility maintains a collection of census data,
administrative records from the Bureau of Labor Statistics and Social Security
Administration, survey data from the Department of Health and Human Services,
the National Center for Education Statistics, the Economic Census from the
Bureau of the Census and many other surveys and administrative data files. The most
frequently researched demographic data bases in the Data Facility are the
National Longitudinal Labor Market Experience surveys, the Panel Study of Income
Dynamics and the two student data files, the High School Class of 1982 and the
High School and Beyond. All of these projects are supported with government
funds. In recent years government agencies funding research have required that
public use files be produced and archived at the
completion of research projects.

Lately research centers in the United States have all felt the effects of
the federal budget cuts, not only in the granting of funds for research but in
the collection and dissemination of data by federal agencies. Many agencies are
reducing the number of surveys they are conducting, reducing the number of data
files they distribute, or reducing staff and equipment to the point where reduced
service to the user community is affecting the acquisition process.

Perhaps because of these budget cuts, our archive has experienced an increase
in the demand for existing data for use in secondary analysis. Certainly, there
has been an increase in the number of queries about the holdings of our archive
by researchers hoping to lower the cost of a project by finding existing data to
suit their needs.

Archive  Future  Outlook 

Funding Research. It would be safe to assume that the process of funding
research is changing, a trend that will probably continue for the
foreseeable future.

Some changes in research funding have taken place at Rand in recent years.
For instance, the Rand Civil Justice Institute was established in 1979 to
perform independent, objective policy analysis on the American civil justice system.
The purpose of the Institute is to help make the civil justice system more
efficient and more equitable by supplying policymakers with the results of
empirically based, analytic research. Examples of the projects undertaken
by the Institute include occupational disease, determinants of delay to case
disposition, history of court congestion and delay, selection of disputes for
litigation, private costs of civil cases, and inflation and jury awards.

The Institute is funded by a broad base of contributors in the private sector.
Since its inception, over 200 organizations have made contributions to the
Institute, including property-casualty insurance companies, other major corporations,
and trade and professional associations.

In  addition,  in  1982  the  Private  Sector  Sponsors  Program  was  initiated  at 
Rand  to  facilitate  private  sector  support  of  research  benefitting  both  industry 
and  government.  Research  and  related  activities  in  this  new  program  will  be 
conducted  within  the  Domestic  Research  Division  under  regular  Rand  procedures 
and  organizational  structures. 

When the Data Facility was first established, very little Rand Project data
was archived. The archive holdings were generally data files acquired from
non-Rand sources. Today we are gradually acquiring Rand-generated data from
completed projects and assisting a much larger percentage of projects with their
acquisition of data. Proposal support remains active.

If privately funded research is a trend, how will this affect the data
archive? In the case of the research in the civil justice area,
machine-readable data rarely exist. It is my hope that the ongoing projects will
create data in machine-readable form to be archived for future use.

Proliferation of Information. Recently a researcher described for me how
five years ago he could search any library and find everything he needed to know
about a particular subject. Today, the researcher added, the information
available is more specific and specialized and there are far too many sources for him
to investigate all of them.

I  hope  our  archive  can  assist  Rand  researchers  with  diffusing  and  filtering 
and  in  some  cases  evaluating  information  that  may  be  of  interest  to  them.  It  is 
my  hope  that  the  Data  Facility  provides  our  clients  with  information  on  occasion 
which  they  might  not  have  had  or  perhaps  would  not  have  had  in  as  timely  a 
manner. 

Changing Technology. It appears the trend toward micro-computer use is
going to continue. Not only will it continue, but it will rapidly expand. There are
many different opinions as to the effect this will have on the research process
and the use of mainframe computers.

One  theory  is  that  as  the  micro  use  goes  up  so  will  the  use  of  the  mainframe. 
I  believe  at  this  point  there  appears  to  be  some  justification  for  this  theory. 
At  this  stage  in  technology  it  is  still  necessary  for  any  but  the  smallest  data 
sets to be downloaded from a mainframe.

However, I have had requests for information about data on floppy disks. In
addition, researchers who had not previously been so inclined are now requesting
information about software packages and utilities on the mainframe, in order to
"talk" to it from their micros.

Conclusions 


Research  is  often  conducted  in  a  departmentalized  manner.  This  is  probably 
more  true  in  an  academic  setting  than  at  Rand,  but  to  some  extent,  it  is  also 
true  here.  Researchers  frequently  depend  on  colleagues  for  information  about 
data.  Research  conducted  in  a  departmentalized  manner  may  cause  some  information 
to  not  be  as  readily  available.  It  is  here  the  data  archive  can  best  assist  in 
a  research  center. 

The data archive is a store of information about data files, data sources,
and data from many different subject areas. Another valuable service
an archive can provide is the institutional "memory," providing researchers
with historical information about past data uses as well as sources of
information about new data files.

Just  as  important  as  the  hard  copy  documents  is  the  individual  who  provides 
a  human  trail  that  allows  researchers  to  follow  new  leads  to  people  who  have  had 
experience  with  a  particular  set  of  data,  or  special  knowledge  about  some  of  the 
files. 

Because I need the contacts, and need to trace the human trail for the
researcher, it is necessary to maintain contacts with other managers of
machine-readable information. For this reason I value my membership in organizations
such as IASSIST since these associations help me to maintain a standard of
professional practice and expertise I might not otherwise obtain.
Ann E. Gerken, Data Archivist

Cornell Institute for Social and Economic Research

Cornell  University 


Historical  Background  and  Development 


The Cornell Institute for Social and Economic Research (CISER) is an
interdisciplinary organization of Cornell faculty which seeks to support, strengthen
and enrich the social and economic research community. In May 1981, CISER was
founded to develop and support research programs and provide services and
facilities required for research projects.

The  CISER  Data  Archive  was  established  in  February  1982  in  cooperation  with 
the  Cornell  University  Libraries  to  provide  central  access  and  management  for 
social science data to researchers on and off campus. Data Archive staff
provide professional information services, technical consultation, and research
support.  CISER  also  sponsors  workshops  and  seminar  series,  peer  review  of 
research  proposals,  grant  management,  computing  facilities,  newsletters,  and 
a  directory  of  research  interests  of  faculty  in  the  social  sciences  at  Cornell. 

The CISER Data Archive was established upon the recommendations of a committee
made  up  of  representatives  from  four  colleges  at  Cornell,  the  university 
libraries,  Cornell  Computer  Services,  and  the  New  York  State  Cooperative 
Extension  Service.  The  committee  based  its  recommendations  upon  a  survey  of 
ten  data  archives  located  within  research  centers.  Information  was  gathered 
on  staff,  collections,  funding,  space,  and  computing  consulting. 

The  Data  Archive's  goals  are  to: 

1.  establish  and  maintain  a  centralized  archive  of  machine-readable  tapes 
and  documentation; 

2.  acquire  data  and  supporting  documentation,  coordinate  buying  consortiums, 
fill  gaps  in  data  file  holdings,  and  assure  the  safekeeping  of  archival 
holdings; 

3.  provide  an  information  center  with  professional  reference  and  computer 
consulting  in  social  science  data,  defining  information  needs  and  providing 
research  services;  and 

4.  support  the  research  and  service  missions  of  the  institute  and  the 
university. 

With the assistance of Cornell's Social Science Librarian, CISER opened the
Data Archive in early February, 1982. A professional archivist joined the CISER
staff and assumed administrative duties as well as the planning responsibilities
for the development of the archive. Data files were acquired; policies, mailing
lists and ordering procedures were established; and a survey of faculty was
taken to identify data files on campus and those needed. The survey was
helpful in locating data files to incorporate into the archive, in establishing
contacts with researchers, and in developing a collection policy.
Since its opening in 1982, the Data Archive has grown significantly. Staff,
services, and the holdings of machine-readable data files have expanded. The
archive is an increasingly important research support facility on campus,
providing essential services to a wide range of users. In addition to walk-in
information and consulting services, the archive offers workshops, seminars,
and classroom lectures on data file contents and use. The integration of
computer consulting with data file reference service makes the archive a unique
resource at Cornell. The staff is dedicated to the provision of continuous
support to the educational and research activities in the social sciences, from
data information to advanced analytical support.

Funding 

CISER  and  the  Data  Archive  are  supported  by  allocated  funds  from  five  colleges 
at  Cornell.  The  acquisitions  portion  of  the  budget  is  relatively  small  since 
most  data  sets  are  acquired  through  Cornell's  membership  in  the  Inter-university 
Consortium  for  Political  and  Social  Research,  through  New  York  State  Data  Center 
affiliate  status,  and  through  other  cooperative  agreements  with  state  and  federal 
agencies.  In  addition,  a  number  of  files  are  donated  to  the  archive  by  the 
faculty.  A  collection  policy  regarding  acceptance  of  donations  and  purchase 
decisions  is  vital  to  the  development  of  the  archival  holdings.  Computing  costs 
and  tape  storage  costs  are  separately  allocated. 

Staffing 

The staff of the archive consists of a professional data archivist, two
full-time computer consultants, and a half-time data manager. Part-time student
assistants are also available during the academic year.

The  data  archivist's  duties  include  administration  of  the  archive,  data  file 
evaluation  and  acquisition,  data  file  information  services,  policy  making,  and 
the  development  of  new  services  for  social  science  researchers.  The  archivist 
works  closely  with  the  Cornell  University  Libraries  to  develop  integrated 
information  services,  and  communicates  with  the  Computing  Services  staff  in 
regard  to  technical  developments  and  services.  The  archivist  is  a  professional 
information  specialist  with  an  academic  research  library  background. 

The  computer  consultants  provide  statistical  computing  consultation  and 
are  responsible  for  the  technical  development  of  the  data  archive.  They  also 
work  on  contract  for  special  data  projects  producing  custom  files,  reports, 
analyses,  and  data  management.  The  consultants  have  social  science  backgrounds 
with  experience  in  statistical  analysis  and  computerized  research  techniques. 

The data manager is a half-time employee with tape management responsibilities.
This  person  keeps  inventories  of  tape  contents  and  data  users  and  oversees  the 
addition  and  copying  of  tapes  in  the  collection.  Fundamental  knowledge  of  the 
computer  and  tape  management  systems  is  required. 

Student  assistants  perform  tape  management  tasks  and  edit  inventory  files. 
Valuable  management  and  secretarial  assistance  is  also  available. 

Physical  Environment 


The  archive  is  housed  with  the  other  offices  of  CISER.  One  large  office 
houses  the  archivist,  the  data  manager,  and  the  library  of  technical  documenta- 
tion and  reference  materials.  Other  offices  house  the  consultants.  Each 
office  has  space  for  archive  users  to  examine  materials.  Print  materials  can 
be  taken  out  overnight.  The  consultants'  offices  have  enough  space  for  small 
group  instruction  and  storage  space  for  printouts  and  other  records. 
As for local equipment, the archive has an IBM-PC, will be acquiring
additional microcomputers, and has two terminals which are used to communicate with
the Cornell mainframe computers. The microcomputers are also used as terminals
for communication and data transfer, as text processors, and for social science
workstation development, including database management, graphics, and custom
programming. Data on floppy diskettes are being distributed.

Public  computing  facilities  in  the  building  offer  state-of-the-art  graphics 
equipment,  high  speed  printers,  consultants,  and  technical  manuals.  CISER  also 
has a computing facility in the building, with terminals for use by CISER
members and their research assistants. Extensive microcomputer facilities are also
located  in  the  same  building  as  the  Data  Archive  including  a  software  library 
and  a  demonstration  facility. 

Other  hardware  access  includes  an  IBM  3081D,  and  a  DEC  1020.  The  holdings 
of  the  archive  are  stored  at  the  mainframe  facility.  Tapes  are  used  on  the  IBM 
and  also  are  exported  for  use  on  other  systems  accessible  to  Cornell  researchers. 

Sources  of  Data 


The CISER Data Archive holds machine-readable data in the areas of demography,
vital statistics, health, social surveys, labor and employment, occupation,
international trade, business, service industries, education, agriculture, and life
studies and aging. The archive comprehensively collects New York State data.

CISER is an Affiliate of the New York State Data Center, and acquires many of
its files from that source. Cornell is also a member of the Inter-university
Consortium for Political and Social Research (ICPSR) which provides the majority
of non-Census files to Cornell. Longitudinal data are acquired from the Bureau
of Economic Analysis. The archive receives data from numerous government
agencies, both federal and state, and also acquires data files from other
research institutions and survey centers. Contributions of research data
files by Cornell faculty members have been central to the development of the
data archive. The archive seeks continuous data deposits to build longitudinal
strengths. Detailed collection development policies are developed in
collaboration with faculty, librarians, and members of the Institute.

Dissemination  of  Information 


To  make  users  of  the  archive  aware  of  the  holdings,  a  title  list  of  data  files 
in  the  archive  is  updated  and  distributed  bimonthly.  Information  about  the 
archive,  its  holdings  and  services,  is  included  in  the  bimonthly  newsletter 
from  CISER,  called  the  syntheCISER.  The  archivist  meets  with  faculty,  graduate 
students,  and  staff  to  advertise  and  promote  use  of  the  archive,  teach  methods 
of  identifying  data  files,  and  establish  a  network  of  data  users.  Future 
developments  will  include  online  directories  of  holdings  and  variable-level 
indexing  of  statistical  data  files.  Cooperative  cataloging  of  machine-readable 
records is being investigated. In addition, a model relationship with one of
the Cornell University Libraries has been established whereby professional
staff development, acquisitions, and information dissemination are coordinated.

Data are disseminated through tape and disk access on the IBM mainframe,
through file transfer to tapes and diskettes, and in special data files and
printouts provided on a custom basis.
Users of the CISER Data Archive

The  users  of  the  archive  represent  the  many  colleges  and  departments  at 
Cornell,  and  numerous  off-campus  organizations.  Services  are  available  to 
faculty,  graduate  and  undergraduate  students,  staff,  off-campus  service  agencies, 
private  corporations,  government  agencies,  and  the  greater  Ithaca  community. 
Access to the holdings of the archive on the mainframe is limited to those
with Cornell computer accounts. Information and data delivery services are
available  to  others  on  a  contractual  basis,  except  when  data  are  restricted  to 
use  by  the  Cornell  community.  Fee  structures  have  been  developed  to  recover 
costs  of  some  tasks.  The  growth  of  the  archive  is  evident  in  the  increasing 
numbers  of  walk-in  and  returning  users. 

Future  Developments 

There  have  been  a  number  of  developments  that  present  challenges  to  CISER 
and  affect  the  long-term  plans  for  the  archive  and  the  Institute.  Among  the 
goals  is  the  expansion  of  the  archive  collection  and  services.  The  archive 
staff's  expertise  in  census  and  other  federal  data  products  has  brought  an 
increasing  number  of  requests  for  special  workshops,  tabulations,  and  data  file 
extracts  of  federal  data  files.  As  an  Affiliate  of  the  New  York  State  Data 
Center,  the  archive  frequently  provides  assistance  to  people  throughout  New 
York  State.  Requests  for  assistance  with  federal  data  from  the  Cornell  commun- 
ity and  off-campus  institutions  and  organizations  are  expected  to  increase. 

Researchers  require  subfiles  and  increased  consultation  for  the  larger  longi- 
tudinal data  sets  and  the  microdata  files  in  the  data  archive.  An  enlarged 
subsetting  service  and  increased  expertise  in  the  management  of  heavily  used 
data  sets  are  being  developed.  In  addition,  technical  support  agreements, 
workshops,  and  classroom  lectures  on  data  file  management  and  research  computing 
techniques  are  being  offered.  The  data  archive  must  keep  pace  with  the  rapid 
changes  in  computer  technology  and  the  applications  for  social  science  research. 
Especially  important  is  the  integration  of  microcomputer  workstations  into  the 
research  environment,  with  development  toward  multi-user  and  multi-level  comput- 
ing capabilities.  The  archive  staff  will  be  working  with  the  Cornell  Micro- 
computer Evaluation  and  Development  Facility  in  the  application  of  microcomputers 
in  social  science  research. 

The  data  archive  also  hopes  to  develop  cooperative  agreements  with  the 
Cornell  University  Libraries  in  efforts  to  integrate  services  and  encourage 
extended  participation  in  computerized  statistical  information  services.  The 
data  archivist  participates  in  the  development  of  an  integrated  library  system 
at  Cornell,  and  is  a  member  of  a  cross-campus  Working  Group  on  Statistical 
Data  Resources. 

Finally,  the  archive  staff  will  be  active  in  the  development  of  a  demographic, 
economic,  and  social  computerized  information  system  on  New  York  State,  to 
support  information,  research,  and  training  activities. 
There are many obstacles to successfully
acquiring machine-readable data.
Difficulties  usually  occur  during  the 
physical  transfer  of  the  data  from  one 
site  to  another,  the  most  common  method 
being  magnetic  tape.  Yet,  even  with 
higher  quality  tape  and  more  efficient 
tape  drives,  tape  processing  is  still 
plagued  with  problems.  Parity  errors, 
incompatible  tape  labels,  multiple 
encoding  schemes,  and  poor  quality 
media  have  all  contributed  to  tape 
users'  frustration.  In  an  attempt  to 
educate  the  nontechnical  professional, 
this  paper  will  give  an  overview  of 
the  technical  aspects  of  magnetic  tapes 
and  their  use.  Topics  include  the 
physical  aspects  of  magnetic  tape,  how 
data  is  stored  on  tape,  recording  modes, 
tape  errors,  and  their  prevention. 

A  data  archivist  faces  two  main 
tasks:  data  acquisition  and  data  main- 
tenance. To  acquire  data  you  must  know 
how  data  is  stored  on  tape  and  what 
your  installation  can  handle.  The 
information  necessary  to  successfully 
transfer  data  between  sites  will  be 
covered  below,  as  will  housekeeping 
measures  essential  to  keeping  magnetic 
tapes  readable. 

Data  Acquisition 


The physical aspects of magnetic
tape. A standard magnetic tape has a
10½-inch hub filled with ½-inch wide
tape. The tape should be 2400 feet long,
with a tolerance of +50 feet, -0 feet.
It is important
to bear this in mind when purchasing
new tapes, as tapes of less than 2400
feet are indicative of poor quality
control.


A  magnetic  tape  is  composed  of  three 
layers.  The  first  layer  is  made  of 
oxide;  this  is  where  the  magnetized  data 
is  stored.  This  layer  must  be  smooth 
and  of  uniform  thickness  (0.00045  of  an 
inch).  The  second  layer  (the  binder) 
is  a  glue  used  to  make  the  oxide  adhere 
to  the  backing.  The  binder  must  be 
flexible  enough  to  reduce  oxide  chipping, 
yet not too sticky, or the tape will
stick  to  itself.  The  last  layer  is  the 
backing,  and  it  is  usually  made  of 
mylar.  Creating  a  magnetic  tape  is  a 
complex  task;  hence  it  is  important  to 
have  rigid  quality  control  standards. 

There  are  three  important  points  to 
be  aware  of  when  exchanging  data  on 
tape:  1)  Is  the  tape  labeled  or  unla- 
beled? 2)  How  many  data  files  are  on 
the  tape?  and  3)  How  many  tape  reels 
does  the  data  span? 

Tapes  can  come  with  or  without  labels, 
and  files  can  come  as  a  single  file  on  a 
single  tape  reel,  multiple  files  on  a 
reel,  a  single  file  on  multiple  reels, 
or  multiple  files  on  multiple  reels 
(references (1) and (2)). You must know
what  types  of  tapes  your  installation 
can  handle  and  what  types  of  utilities 
are  available  at  your  site  for  process- 
ing these  types  of  tapes.  Check  to  see 
if  one  form  is  easier  to  handle  than 
another.  If  so,  see  if  you  can  receive 
the  data  from  the  sending  organization 
in  that  form. 

Here  is  a  list  of  the  other  items 
you  need  to  know  when  receiving  a  tape: 

1.  Is  the  tape  9-track  or  7-track 
(7-track  is  fairly  old  technology, 
but  does  still  exist)? 


2.  How  large  is  the  blocksize  of  the 
data?  Some  systems  cannot  handle 
blocked  data. 

3. At what density was the data
recorded (6250 BPI, 1600 BPI, or 800
BPI)?

4.  At  what  parity  was  the  data 
recorded  (even  or  odd)? 

5.  What  is  the  character  set  in 
which  the  data  was  recorded? 

Some  character  set  possibilities 
include: 

ASCII     American National Standard
          Code for Information Interchange.
          Most commonly found on Digital
          Equipment Corporation systems;

EBCDIC    Extended Binary Coded Decimal
          Interchange Code. IBM,
          Amdahl, Burroughs;

BCD       Binary Coded Decimal Character
          Code. CDC 6600, Honeywell;

FIELDATA  Standardized Military Data
          Transmission Code. Univac.
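
To make the character set question concrete, here is a minimal Python sketch of translating fixed-length EBCDIC records into readable text. It assumes the tape image has already been copied to disk as raw bytes and that the data uses EBCDIC code page 037; the function name and the sample record are illustrative, and other EBCDIC variants differ in a few characters.

```python
# Sketch: decode fixed-length EBCDIC records into text.
# Assumption: code page 037 (a common US EBCDIC variant); your data may use another.
EBCDIC_CODEC = "cp037"


def decode_ebcdic_records(raw: bytes, lrecl: int) -> list[str]:
    """Split raw data into lrecl-byte records and decode each one."""
    if lrecl <= 0:
        raise ValueError("lrecl must be positive")
    if len(raw) % lrecl != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [raw[i:i + lrecl].decode(EBCDIC_CODEC) for i in range(0, len(raw), lrecl)]


# Ten bytes of made-up EBCDIC character data, split into two 5-byte records.
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x40, 0xF1, 0xF9, 0xF8, 0xF5])
records = decode_ebcdic_records(sample, lrecl=5)
```

A translation of this kind is only safe for character data; binary or packed-decimal fields in the same record would be corrupted by it and need to be handled separately.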

The  main  point  to  all  this  is:  Know 
what  your  system  has  and  what  it  can 
handle.  Your  Computing  Services  depart- 
ment should  have  a  handout  telling  the 
easiest  way  to  receive  data.  This 
information  is  imperative  for  a  suc- 
cessful transfer  of  data. 
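
Density and blocksize together determine how much data fits on a reel, which is one reason to ask about both. The sketch below estimates the capacity of a 2400-foot tape. The inter-block gap lengths are assumed nominal values (0.6 inch at 800 and 1600 BPI, 0.3 inch at 6250 BPI), and the estimate ignores labels, tape marks, and leader, so treat the result as a rough upper bound rather than a guarantee.

```python
import math

TAPE_LENGTH_FEET = 2400

# Assumed nominal inter-block gap, in inches, for each recording density.
GAP_INCHES = {800: 0.6, 1600: 0.6, 6250: 0.3}


def blocks_per_reel(blksize: int, density_bpi: int,
                    length_feet: int = TAPE_LENGTH_FEET) -> int:
    """Number of whole blocks that fit, counting one gap per block."""
    block_inches = blksize / density_bpi + GAP_INCHES[density_bpi]
    return math.floor(length_feet * 12 / block_inches)


def reel_capacity_bytes(blksize: int, density_bpi: int,
                        length_feet: int = TAPE_LENGTH_FEET) -> int:
    """Approximate data bytes on one reel for a given blocksize and density."""
    return blocks_per_reel(blksize, density_bpi, length_feet) * blksize


large_blocks = reel_capacity_bytes(24192, 6250)  # heavily blocked, high density
card_images = reel_capacity_bytes(80, 1600)      # unblocked 80-byte records
```

Under these assumptions the unblocked 80-byte case wastes most of the tape on gaps, which is why a sending site's choice of blocksize matters as much as its density.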

I  have  found  the  government  form 
"Transmittal  Form  for  Describing  Com- 
puter Magnetic  Tape  File  Properties"  to 
be  an  invaluable  reference  when  receiv- 
ing or  sending  data  (reference  (3)). 
You  can  use  this  form  in  one  of  two 
ways.  When  receiving  data,  fill  it  out 
the  way  you  need  to  receive  the  tape, 
and  send  the  form  along  with  your  data 
request.  When  sending  data,  send  the 
form  filled  out  with  the  specifications 
of  the  way  the  tape  was  created. 
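
The questions above can also be kept as a small machine-checkable record. The following Python sketch is not the government form itself; the field names, the `SiteCapabilities` structure, and the example values are all hypothetical. It compares what a sender proposes against what the receiving installation can read, and checks only tracks, density, character set, and blocksize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeSpec:
    """What the sender says about a tape (hypothetical field names)."""
    tracks: int
    density_bpi: int
    parity: str
    charset: str
    labeled: bool
    blksize: int


@dataclass(frozen=True)
class SiteCapabilities:
    """What the receiving installation can handle (hypothetical)."""
    tracks: frozenset
    densities: frozenset
    charsets: frozenset
    max_blksize: int


def transfer_problems(spec: TapeSpec, site: SiteCapabilities) -> list[str]:
    """Return a list of mismatches; an empty list means no known obstacle."""
    problems = []
    if spec.tracks not in site.tracks:
        problems.append(f"cannot read {spec.tracks}-track tapes")
    if spec.density_bpi not in site.densities:
        problems.append(f"cannot read {spec.density_bpi} BPI")
    if spec.charset not in site.charsets:
        problems.append(f"no translation available for {spec.charset}")
    if spec.blksize > site.max_blksize:
        problems.append(f"blocksize {spec.blksize} exceeds {site.max_blksize}")
    return problems


site = SiteCapabilities(frozenset({9}), frozenset({1600, 6250}),
                        frozenset({"EBCDIC", "ASCII"}), max_blksize=32760)
good = TapeSpec(9, 6250, "odd", "EBCDIC", True, 24192)
bad = TapeSpec(7, 800, "even", "BCD", False, 40000)
```

Running `transfer_problems(bad, site)` before a request goes out gives the sender a concrete list of what to change.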

Tape  Maintenance 


How to minimize, correct, and prevent
errors. Most tape errors are due
to contamination of the tape, physical
mishandling, or problems with tape
drives and/or tape cleaning equipment.
Here are some items to
consider to keep your tapes readable
(reference (4)).

Reduce contamination. Most tape
contamination comes from the tapes
themselves. Oxide chips off the surface
of the tape each time the tape is
used. One way to reduce this chipping
is to buy high quality tapes made with
good binder. Other contamination comes
from carelessness in handling tapes or
a dirty computer room. Some ways to
reduce contamination are:

1. Regularly clean tape drives. (At
CNA we clean the drives at the
start of every shift and before
tape-intensive jobs.)

2. Maintain the recommended
temperature and humidity; this
helps to reduce oxide chipping.

3. When cleaning the floors around
tapes, clean the entire floor with
a damp mop. DO NOT sweep, dry mop,
or dust.

4. Minimize floor waxing. If you
must wax, machine buff to remove
the excess wax, damp mop with cold
water to harden the surface, and
buff again when dry. NEVER use
steel wool or other metal abrasives
for buffing.

Another way to reduce contamination
is to clean tapes regularly (once every
eight uses). Be very careful if you
decide to do this, as some tape cleaners
do more harm than good. I would
not recommend using a tape cleaner on
a tape that is error free.

Tape drives can also produce tape
errors. Your organization should have
a regular tape-drive maintenance program
in which a field engineer does routine
cleaning and alignment of the
drives. This preventive maintenance
is invaluable.

Long-term storage. Once a tape has
been  hanging  on  a  rack  for  about  six 
months,  it  starts  to  deteriorate. 
Pieces  of  dirt  and  chipped  oxide  start 
to  cause  dents  in  the  backing  of  the 
tape.  The  tape  may  become  unreadable 
because  the  dent  is  causing  the  tape 
to  be  too  far  away  from  the  read  head 
of  the  tape  drive.  The  best  way  to 
prevent  this  type  of  problem  is  to  spin 
tapes  every  six  months.  The  preferred 
method  is  to  have  a  program  that  scans 
the  data  on  the  tape  to  be  sure  the 
tape  can  be  read.  If  there  are  problems, 
you  will  know  it  early  and  will  be  able 
to  copy  the  data  to  another  reel.  If 
you  wait  for  a  long  period  of  time  to 
spin  a  tape,  the  damage  could  be  perma- 
nent. The  concept  is  the  same  as 
walking  with  a  rock  in  your  shoe.   If 
you  walk  a  block  and  take  it  out,  you 
probably  will  not  suffer  any  long-term 
damage.   If  you  walk  a  mile  with  a  rock 
in  your  shoe,  you  will  have  a  hole  in 
your  foot. 
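
The six-month rule is easy to automate once the date of each tape's last successful scan is recorded. This is a minimal Python sketch of such a schedule; the 182-day interval stands in for "six months", and the function name, volume serials, and dates are invented for illustration.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 182 days.
SPIN_INTERVAL = timedelta(days=182)


def tapes_due_for_spin(last_verified: dict[str, date], today: date) -> list[str]:
    """Return the volume serials whose last scan is at least six months old."""
    return sorted(
        volser
        for volser, scanned_on in last_verified.items()
        if today - scanned_on >= SPIN_INTERVAL
    )


# Hypothetical log of the last successful read of each reel.
last_verified = {
    "004670": date(1984, 5, 1),
    "004671": date(1984, 11, 15),
    "004672": date(1984, 1, 10),
}
due = tapes_due_for_spin(last_verified, today=date(1985, 1, 2))
```

The list can drive whatever scanning program the installation already uses; a tape that fails the scan should be copied to another reel right away.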

Tape  handling.      Tips  for  tape 
handling  include: 

1.  Keep  tapes  in  the  computer  room. 

2.  Tapes  should  not  be  laid  on  top 
of  the  tape  unit. 

3.  External  tape  labels  should  be 
sticky  labels  that  peel  off  and 
leave  no  residue. 

4.  Never  allow  the  beginning  of  the 
tape  to  trail  on  the  floor. 

5.  Smoking  should  not  be  permitted 
around  tapes. 

Tape  storage.      Tapes  should  be 
stored  in  an  upright  position  in  a  cabi- 
net or  shelf  elevated  from  the  floor, 
and  as  far  away  from  sources  of  paper 
and  card  dust  (line  printers  and  card 
reader/punch)  as  possible. 

Backups  of  valuable  data  sets  should 
be  kept  off  site.  To  save  money,  look 
for  a  sister  institution  to  swap  tapes 
with instead of paying for vault storage.

Tape Transmittal. Very often tapes
are damaged in the mail or while being
hand carried. Tapes sent through the
mail should be clearly marked
MAGNETIC TAPE--KEEP AWAY FROM ELECTRIC
MOTORS, SCANNING DEVICES AND MAGNETS--
DO NOT X-RAY. Be sure to give the
above advice to the courier for tapes
being hand carried.

Conclusions  and  Recommendations 

To  ease  data  transfer,  know  the 
answers  to  all  of  the  questions  on  a 
form  like  "Transmittal  Form  for  Des- 
cribing Computer  Magnetic  Tape  File 
Properties"  (reference  (3)).  Be  sure 
to  ask  for  some  type  of  a  dump  or  map 
of  the  tape  being  created.  This  makes 
the  sending  installation  look  at  the 
tape  after  it  is  written--that  is,  it 
forces  them  to  be  sure  something  was 
written  to  the  tape.  Finally,  be  sure 
to  get  documentation  on  the  data. 
These  items  include  record  layout, 
descriptions  of  all  the  variables  and 
how  they  were  derived,  and  the  count 
of  the  total  number  of  records  in  each 
file. 
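
Once the documentation supplies a record length and a record count, the received file can be checked against both. The Python sketch below assumes fixed-length records already copied from tape to disk as raw bytes; the function name and the sample figures are illustrative only.

```python
def check_fixed_length_file(data: bytes, lrecl: int, expected_records: int) -> list[str]:
    """Compare a received file with its documented record length and count."""
    if lrecl <= 0:
        raise ValueError("lrecl must be positive")
    problems = []
    whole_records, leftover = divmod(len(data), lrecl)
    if leftover:
        problems.append(f"{leftover} stray bytes after the last whole record")
    if whole_records != expected_records:
        problems.append(
            f"found {whole_records} records, documentation says {expected_records}"
        )
    return problems


# Made-up example: three 80-byte records received, four documented.
received = b"x" * 240
report = check_fixed_length_file(received, lrecl=80, expected_records=4)
```

An empty report does not prove the data are correct, but a non-empty one is a good reason to contact the sending installation before the tape is archived.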

To ensure in-house data reliability,
scan data tapes every six months to a
year to be sure they are readable. At
the first sign of trouble, recopy the data.
Recopy valuable data sets to NEW tapes
every three or four years. Try to
maintain cleanliness at your site. DO
NOT use a cheap tape cleaner. It will
do more damage than good. Most important
of all: buy the best quality tapes
you can afford.

References

(1) VAX/VMS Magnetic Tape User's Guide,
VAX/VMS Version 3.0. Digital
Equipment Corp., Maynard, MA; Chapter 2.
May, 1982.

(2)  U.S.  Department  of  Commerce. 
National  Bureau  of  Standards.  Mag- 
netic Tape  Labels  and  File  Struc- 
ture for  Information  Interchange. 
Federal  Information  Processing 
Standards  Publication  79.   1980. 

(3)  U.S.  Department  of  Commerce. 
National  Bureau  of  Standards. 
Transmittal  Form  for  Describing 
Profile

Rutgers,  the  State  University  of 
New  Jersey,  which  was  chartered  as 
Queen's  College  in  1766,  and  was  desig- 
nated a  state  university  in  1945,  is  a 
large  university  offering  a  variety  of 
learning  environments.  Today  the  Uni- 
versity has  over  47,000  students 
enrolled  in  six  separate  colleges  on 
the  four  campuses  in  New  Brunswick, 
and the campuses at Newark and Camden,
each  of  which  is  over  50  miles  away 
from  the  New  Brunswick  site.  There 
are  24  instructional  divisions  and 
about  16  affiliated  research  units. 

Facilities  and  support  for  academic 
computing  are  managed  by  the  Center  for 
Computer  and  Information  Services 
(CCIS)  which  provides  services  to 
students  and  faculty  who  use  computing 
for  instruction  and  research  purposes. 
Services  include  non-credit  education 
courses,  a  reference  center,  a  news- 
letter, maintenance  of  terminals  and 
remote  job  entry  facilities,  equipment 
loaned  to  classrooms,  program  packages 
support,  data  archives  and  data  base 
management,  system  programming,  docu- 
mentation, accounting  and  billing. 

In  1971,  the  Princeton-Rutgers  Census 
Data  Project  came  into  being  through  the 
combined  efforts  and  finances  of  both 
universities. Since there had already
existed a tradition of cooperation between
the two universities on special
data collections, they decided to share
the purchase of 1970 Census data.
The project was organized with the
support of the Center for Research
Libraries and financial contributions
from the libraries of Princeton and
Rutgers as well as interested departments
on both campuses. Through the START-1
program, the Data Use and Access Laboratory
(DUALabs),  a  non-profit  organization, 
purchased  the  1970  Census  tapes  as  they 
became  available  from  the  Census  Bureau, 
processed  the  tapes,  condensed  the  data, 
and  sold  copies  at  a  reduced  cost  to  its 
members.  Along  with  data  modifications, 
several  computer  programs,  known  as  the 
MOD  series,  were  developed  to  access  the 
data  and  were  installed  at  Rutgers  and 
Princeton  Universities.  All  the  tapes 
were  stored  at  Princeton,  and  Rutgers 
copied  only  those  pertinent  to  its 
researchers.  In  order  to  administer  the 
project,  both  universities  were  respon- 
sible for  publicity,  training,  and 
physical  tape  maintenance. 

The project became self-sustaining
by charging outside organizations
programming fees and computer costs, which
then covered the purchases of new tapes.
The project has continued to promote
cooperation and
support between the two universities.

ICPSR  and  ROPER  Memberships 

In 1965, a member of the Political
Science department requested membership
in ICPSR. A few years later, membership
in ROPER was established by the Political
Science department and then transferred
to Sociology. As the membership
in these organizations became known,
an increasing number of researchers
discovered that the data produced by
national archives have intrinsic
research and academic value. As interest
in these memberships increased, it
became apparent that the individual
departments could not handle the workload.
Since the Princeton-Rutgers
Census Data Project had been functioning
successfully, the CCIS decided to
centralize other data bases in the same
manner. The library assumed
operational control of transferring all the
relevant information and materials from
the individual departments and of developing
administrative and ordering
procedures to facilitate the acquisition
of data.

Data  Base  Advisory  Committee  (DBAC) 


As  to  the  administration  of  the  ROPER 
and  ICPSR  memberships,  the  CCIS  favored 
the  creation  of  the  Data  Base  Advisory 
Committee to ensure adequate communication
between the CCIS, the libraries,
and  the  departments,  to  determine 
University  policy  concerning  future 
data  acquisitions  and  to  select  offi- 
cial representation  to  the  ICPSR  and 
ROPER  organizations.  The  Committee, 
established  by  the  Director  of  the  CCIS, 
consisted  of  representatives  from  the 
CCIS,  the  Library  and  the  political  and 
social  science  departments  on  the  New 
Brunswick campus. After the first
meeting, it was expanded to include
representatives from the Camden and
Newark campuses as well. When a discussion
of the budget for the ROPER membership
led to a joint membership by
the  Rutgers  and  Princeton  Libraries,  a 
representative  from  Princeton  joined 
the  committee.  Although  the  committee 
is  limited  to  a  maximum  of  ten  members, 
guests  are  invited  and  welcome.  This 
committee, which convenes once or twice
a year, discusses allocation of available
resources in the departments,
decides  who  shall  represent  Rutgers  at 
the  ICPSR  Conference,  and  awards  any 
scholarship  to  ICPSR  science  programs 
that  become  available.  Communications 
by  mail  and  phone  are   conducted  con- 
tinually with  committee  members  on 
relevant  data  matters  as  the  need  arises. 

The  New  Jersey  State  Data  Center 

In anticipation of the large amounts
of data produced by the 1980 decennial
census, the U.S. Bureau of the Census
established  a  State  Data  Center  Program 
throughout  the  country  to  improve  access 
to  and  use  of  census  data  products. 
Rutgers  University,  as  a  primary  parti- 
cipant of  the  New  Jersey  State  Data 
Center  (NJSDC),  documents,  distributes, 
and  publicizes  these  materials.  The 
CCIS has also made available the Census
Software Package (CENSPAC), which is an
all-purpose statistical and retrieval
program created by the U.S. Bureau of the
Census to be used with the Census
data.

Funding 

The  data  activities  fall  within  the 
Applications  Group  of  the  Center  for 
Computer  and  Information  Services  which 
provides  the  facilities  and  the  support 
services  for  academic  (instructional 
and  research)  computer  users.  No  salary 
lines  are  designated  specifically  for 
the  data  archives.  Our  programmers  are 
responsible  for  computer  expertise  on 
our  software  and  hardware  for  all  our 
computer  systems. 

Travel  requests  are  considered  on  an 
individual  basis  depending  upon  the 
overall  requests  for  travel  within  the 
budget  limitations.  This  fiscal  year, 
I  felt  very  fortunate  to  attend  the 
State  Data  Center  conference,  the 
Association  of  Public  Data  Users,  and 
this meeting of the International Association
for Social Science Information
Services and Technology. But our staff
participation  in  such  events  varies  from 
year  to  year. 

The  Rutgers  membership  in  the  Inter- 
University  Consortium  for  Political  and 
Social  Research  and  the  Rutgers-Princeton 
joint  membership  in  the  Roper  Center  are 
financed  through  the  library  budget.  If 
some  departments  request  the  purchase  of 
data  outside  of  these  memberships,  the 
computer  center  coordinates  the  search 
for  funding  it. 

Overhead expenses for office space,
secretarial staff, mail, postage, telephone,
etc., are not considered
here because these were already in
existence when the data archiving activity
came into being. The cost and maintenance
of the computer equipment and the
cost of data processing come out of our
current operating budget.

Although most of the computing with
the machine-readable data is done on
our IBM mainframe, some is also done
on the VAX 11/780, which has SPSS and
SCSS, and the DEC 2060, which stores a
CITIBASE data file. The various departments
are allotted a specific dollar
amount for computing time, which is
based on the previous year's usage and
future estimates of need.

Staffing 


At  the  time  of  the  implementation  of 
the  Rutgers-Princeton  Census  Data 
Project,  a  half-time  programmer  analyst 
line  was  created  to  carry  it  out.  When 
all  the  data  activities  were  centralized 
at  the  computer  center,  the  responsibi- 
lities were  expanded  and  half-time  of 
another  programmer  position  was  included. 
Unfortunately,  this  past  year,  because 
of  many  changes  in  personnel  and  the 
addition  of  a  new  computer,  we  lost 
ground  in  this  area.  Now  less  than 
one  full-time  line,  shared  among  three 
people,  is  devoted  to  machine-readable 
data file activities. And it is not
enough. In the first nine months of this
academic year about 250 consultations
were recorded, or two or three requests
on an average day. This figure
does not include quick references in
the libraries or computer-related problems
which may go to the statistician
or Aid Station. (There are Aid Stations
on each of the campuses which are
staffed by students and the computer
center staff to aid in debugging all
user problems.)

Sources  of  Data 


During  this  academic  year  we  have 
serviced  more  than  23  different  depart- 
ments on  campus.  Their  data  requests 
have  referred  to  many  different  studies 
in many different fields. Requests for
our census service are just as likely
to come from outside the University,
particularly non-profit county and
state agencies, as from within the
University. The level of sophistication
in handling MRDF's ranges from
zilch to familiarity with statistical
packages on the computer. All manner
of problems come to the CCIS Aid Stations,
the statistical consultants,
and our staff during any given day.
In  general,  the  procedure  in  handling 
inquiries  is  fairly  routine.  First, 
we  check  our  Rutgers  University  Guide 
to  Machine-Readable  Data  Files  to  see  if 
the  file  requested  is  already  on  campus. 
If  not,  the  catalogs  of  ICPSR,  the 
Roper  Center,  and  the  Bureau  of  the 
Census  are  searched  for  the  particular 
file  or  subject  requested.  Data  from 
the  first  two  are  easily  obtained  be- 
cause of  the  memberships  we  maintain 
with  these  groups.  The  census  inquir- 
ies require  a  different  approach. 
Requestors  are  directed  first  to  the 
printed  reports.   If  the  information  is 
available  only  on  tape,  the  researcher 
is  assisted  in  ascertaining  what  tape 
contains  the  needed  data,  what  census 
geographic  area  will  most  suit  the 
needs  of  the  project,  and  which  program 
should  be  utilized.  If  the  data  needed 
is  from  a  source  which  requires  a  cash 
outlay,  the  staff  assists  the  research- 
ers in  finding  funding,  if  at  all 
possible. 

It  is  difficult  to  determine  which 
files  are  heavily  used.  The  number  of 
tape  mounts  does  not  give  an  accurate 
picture  of  how  frequently  the  data  is 
accessed.  Most  users,  after  accessing 
the  tape  once  or  twice,  create  a  sub- 
file on  their  own  and  continue  their 
analytic  studies  on  the  smaller  file. 
The  sophisticated  users  know  how  to 
find  out  the  tape  information  without 
checking  with  us.  At  the  present  time, 
the  most  frequently  used  studies  appear 
to  be  the  STF  3A  tape  from  the  1980 
Census  of  Population  and  Housing,  the 
NORC  General  Social  Surveys,  the  Amer- 
ican National  Election  Studies  from 
Michigan,  and  the  National  Longitudi- 
nal Studies  from  Ohio  State. 
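
The subfile step described above usually amounts to keeping some records and some columns of a fixed-width file. Here is a minimal Python sketch of that idea. The record layout, column positions, and sample values are hypothetical and do not describe any real census file; the real positions come from the codebook.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical layout for illustration: 0-based, end-exclusive column ranges.
STATE = (0, 2)
COUNTY = (2, 5)
POPULATION = (5, 12)


def extract_subfile(records: Iterable[str],
                    keep: Callable[[str], bool],
                    columns: list[tuple[int, int]]) -> Iterator[str]:
    """Yield only the wanted columns of the records that pass the filter."""
    for record in records:
        if keep(record):
            yield "".join(record[start:end] for start, end in columns)


# Three made-up records: state, county, population, then unused filler.
full_file = [
    "34023" + "0595893" + "X" * 20,
    "34021" + "0307863" + "X" * 20,
    "36109" + "0087085" + "X" * 20,
]
subfile = list(extract_subfile(
    full_file,
    keep=lambda rec: rec[STATE[0]:STATE[1]] == "34",
    columns=[COUNTY, POPULATION],
))
```

Because the subfile is so much smaller than the original, later analysis runs never touch the tape, which is exactly why tape mounts undercount actual use.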
Training/Workshops/Seminars. CCIS is
always looking for ways to reach more
Rutgers users. At the beginning of each
semester, two-session seminars are conducted
to familiarize researchers
with the content of the 1980 Census of
Population and Housing and other Census
products. Another class is held on
the  general  Data  Archives  to  describe 
the  types  of  data  available  for  research 
and  study  purposes.  Special  seminars 
or  workshops  are  conducted  at  the 
request  of  individual  units  within  the 
university  and  are  tailored  to  their 
particular  interests  and  needs.  The 
close  association  with  the  reference 
librarians  is  reflected  in  a  special 
seminar  for  the  Reference  Special 
Interest  Group  on  the  resources  at  the 
CCIS  with  special  emphasis  on  the  1980 
Census. 

Publications/Articles/Documents.
Articles on data-related information
appear  regularly  in  the  CCIS  bi-monthly 
Newsletter,  but  CCIS  also  publishes  a 
number  of  Technical  Documents  pertain- 
ing to  machine-readable  data  files  and 
the  computer  programs  available  to 
access  them.  Our  publications  are  all 
geared  towards  making  data  use  easier 
for  the  University  community.  As  an 
example  of  this  type  of  publication, 
information  was  extracted  from  the 
Master  Area  Reference  File  for  the  1980 
Census  of  Population  and  Housing  data 
and  sent  to  the  reference  librarians 
in  all  the  libraries  on  all  the  Rutgers 
campuses. This computer output included
not only the census geographic codes
and corresponding area names, but also
an index by counties, an
explanation of the symbolic codes, and
total counts for population, housing,
and families. Complementing this will
be our output, probably on microfiche,
on county and MCD by zip code for distribution
to the libraries. Finally,
our most important publication is the
previously mentioned Rutgers University
Guide to Machine-Readable Data Files,
an index of all machine-readable data
files on campus.


Consultation Services. For guidance
and assistance in using any of the
machine-readable data, the CCIS offers
a consultation service, free of charge,
to direct users of the data, to help
users select the computer program best
suited to their needs, to provide
necessary program and technical documentation,
and to assist in analyzing
computer error messages should they
occur.

Computer  Reference  Center.      Under 
the  auspices  of  the  CCIS  is  the  Computer 
Reference  Center  (CRC),  a  library  of 
computer-related  materials.  All  the 
codebooks, manuals, and reference books
are  located  in  the  Data  Archive  Corner 
to  facilitate  accessibility  to  the 
widest  possible  range  of  users.  These 
codebooks  themselves  can  be  useful  tools 
in  data  analysis,  sometimes  eliminating 
the  need  to  access  the  files  by  computer. 
In  addition,  catalogs  of  data  holdings 
of  several  institutions  which  collect 
and  disseminate  data  are  available  as 
well  as  computer-related  periodicals. 
An  information  specialist  who  maintains 
and  updates  these  materials  assists 
users  in  finding  the  information  they 
need. 

Future  Developments 

Our future plans center on professional
development for the staff,
improvement  of  our  excellent  Census 
service,  and  expansion  of  the  contract 
programming  activity.  In  addition  to 
in-house  training  workshops,  staff  are 
encouraged  whenever  possible  to  attend 
conferences  and  workshops  which  enhance 
their  professional  skills.  It  is  hoped 
that  more  money  can  be  made  available 
for  such  attendance  in  the  future.  Along 
with  attendance  at  these  functions, 
staff  are  also  urged  to  participate  in 
the  related  organizations  which  sponsor 
these meetings, such as APDU or IASSIST.
These  organizations  do  invaluable  work 
in  fostering  increased  awareness  of 
MRDF's  both  on-campus  and  in  the  wider 
academic  world.  Such  participation 
Utilities are the tools that allow the data archive to properly archive and
maintain the contents of its holdings. They are the programs we use to survey
or verify a tape's content, to copy tapes or print records, and to resolve
tape problems. Many archives will have their own set of tools, but for the new
archivist there is much to learn. It is hoped this paper will encourage the
new archivist to look beyond their own office for tools.

Sometimes,  even  the  experienced  archive  staff  are  not  aware  of  the  software 
packages  available  in  their  computer  centers.  Often  statistical  data  manage- 
ment packages  have  useful  utilities  the  archive  might  use.  This  article  will 
describe  the  utilities  used  most  frequently  at  The  Rand  Corporation's  Data 
Facility.  A  few  hypothetical  cases  may  clarify  what  utilities  are  available 
and  will  also  illustrate  the  use  of  some  of  the  utilities  for  the  new  archivist. 

Many of the examples in this article are SAS programs, simply because
these programs do the job and are available at Rand. I
encourage readers to make enquiries about similar kinds of software available
to them.

EXAMPLE  #1.  Tape(s)  Received  for  Archiving  from  an  Outside  Source 

In this example our hypothetical tape has just arrived on our desk for
archiving. The worst possible situation would be that the tape
itself carries absolutely no information about its contents and is not accompanied
by any documentation. This occurs more frequently when project personnel order
a tape or data over the phone and, when the tape arrives, bring it to the
archive for assistance with accessing the tape. In another case the tape
labels indicate one type of contents and the accompanying documents differ in one
or two of the necessary parameters. With luck, the tape will be fully documented.

TAPEMAP (or survey or scan): This utility reads the tape looking for
labels. If labels are found, the information on the labels is printed out;
if no labels are found, the blocksize and density of the tape are sensed by
the computer and that information is reported.

At Rand, and probably in most computer centers, there is more than one TAPEMAP
utility. Usually the utility to use is dictated by the problem and the tape
involved. SAS also has a mapping program with the inappropriate name of "SAS
Label." It also provides the estimated footage of tape used. Following is an
example of the JCL for a SAS map and a copy of the output. (This output has
been edited slightly to accommodate the narrow page.)


Input:

//D0004SAS JOB (2082,30,393),'SAS MAP',CLASS=S
// EXEC SAS,OPTIONS='NONEWS,S=72'
//TAPE1 DD UNIT=HIGH9,
//         VOL=SER=004670,
//         DISP=OLD
PROC TAPELABEL DDNAME=TAPE1;

Output:

                          SAS
              TAPE LIST FOR DDNAME - TAPE1
            CONTENTS OF TAPE VOLUME - 004670

FILE                                  BLOCK  EST.
NUMBER DSNAME     RECFM LRECL BLKSIZE COUNT  FEET   CREATED EXPIRES JOB
1      STF3A.OR41 FB    2016  24192   2730   1018.2 01MAY84 0000000 D0004
                                             1018.2

EXAMPLE #2. The Tape Received Has a Data Check or Cannot Be Read by Any Mapping Utility

To resolve this problem we use a utility called Fast Analysis of Tape and
Recovery. This utility reads everything on the tape. When it cannot read a
section, it runs the tape back and forth over the drive to clean off any specks
of dust, etc., and then provides an output on the condition of the tape.

EXAMPLE  #3.  Tape  Cannot  Be  Accessed  by  the  Usual  Methods 

Very often the resolution to this problem is simple. The utilities we use
frequently expect to find data (most utilities read a label as data) at the
beginning of a tape. Occasionally we receive tapes with a tape mark at the
beginning of the tape, so that the data is the second 'file' on the tape. Most
utilities would not get past the tape mark. The resolution is to try the utility
again and 'read' the second file on the tape. Instead of "label=(,blp)" use
"label=(2,blp)" for jobs requiring JCL.

EXAMPLE #4. Contents of the Data to Be Verified and Tape Copied


Once we have verified that the tape has the properties we expected, or we have
learned the true specifications, the next step is to verify the contents. Do the
records have the right format? Are the codes found in the records correct? If
the initial map information appears to be okay it may be a good idea to copy the
tape now. Checking records, etc., is often easier with tapes which are
formatted to in-house specifications.

To  print  records:  Usually  we  use  an  online  utility  which  sets  up  the  JCL 
required  for  a  software  package  called  DYLAKOR.  With  this  utility  we  provide 
the  tape  number,  the  label  where  the  data  resides  and  the  dataset  name.  The 
program  works  just  as  well  if  the  tape  is  nonlabeled. 

As mentioned previously, we often make use of SAS in our archive management.
The following program is useful for printing records from any tape (the tapes
do not have to be in SAS format). The program does require that one variable
on the tape be identified and the location of the field in the record
where the variable can be found. Following is an example of a SAS program which
will print out a specified number of records (OBS=100 in this case).

//JOB CARD
// EXEC SAS
//TAPE DD DSN=data.set.name,
//        VOL=SER=volser,UNIT=TAPE,DISP=OLD
DATA;
INFILE TAPE OBS=100;
INPUT;
LIST;

In the previous example, if your tape is non-labeled (NL), give the DCB
information either in the JCL or in the INFILE statement
(DCB=(RECFM=recfm,LRECL=lrecl,BLKSIZE=blksize)). If the DCB information is not
known SAS assumes BLKSIZE=32767, RECFM=U. The output of a non-labeled tape
using these defaults will give you a readable dump, allowing you to see what the
data looks like and perhaps to identify the format.
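
A similar readable dump can be produced outside SAS once the raw bytes of a
file are on disk. The Python sketch below is an illustration only: the function
name dump_records is hypothetical, and it assumes fixed-length records encoded
in EBCDIC (code page 037). Characters that are not printable are shown as
periods.

```python
def dump_records(data, lrecl, obs=100, encoding="cp037"):
    """Return up to `obs` fixed-length records as printable text lines."""
    lines = []
    for start in range(0, len(data), lrecl):
        if len(lines) >= obs:
            break
        record = data[start:start + lrecl]
        text = record.decode(encoding, errors="replace")
        # Replace unprintable characters so the dump stays readable.
        lines.append("".join(ch if ch.isprintable() else "." for ch in text))
    return lines


# Two 8-byte EBCDIC records built in memory for demonstration.
sample = "OR410001OR410002".encode("cp037")
for line in dump_records(sample, lrecl=8, obs=2):
    print(line)
```

If the record length is not yet known, trying a few candidate values of lrecl
and looking for fields that line up from record to record is one way to guess
the format.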

EXAMPLE #5. Making Tape Copies


With an IBM computer and a straightforward tape the most reliable copy program
is IEBGENER, and this program is documented in the IBM manuals. However,
as is often the case, there are other programs which may save time and
effort. With IEBGENER each data set on a tape must be described in the JCL.
When a tape has many data sets this can take considerable time. For this
reason we use a number of copy utilities, some of which were written at Rand
and are only usable here.

SAS also has a nice tape copy utility which is very useful. This program
should not be confused with the SAS PROC "COPY" for copying SAS datasets.
The SAS utility "TAPE COPY" may be used to copy any data. The nicest feature
of this program is the ability to copy all files merely by writing in a
range (e.g., files 1-7) or to copy files in a mixed order (e.g., files 1,2,
9,4,3). Following is an example of a SAS copy program:

//D0004SAS JOB (2082,200,393),'MCGEE',CLASS=S
// EXEC SAS
//VOLIN  DD UNIT=HIGH9,DISP=OLD,
//          VOL=SER=004743,LABEL=(,BLP)
//VOLOUT DD UNIT=HIGH9,DISP=(NEW,KEEP),
//          VOL=SER=004195,LABEL=(,BLP),DCB=DEN=4
PROC TAPECOPY;
FILES 1 8-10;
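
How a file specification of this kind expands can be sketched in a few lines.
The Python helper below is hypothetical and is not part of SAS; it assumes the
specification is a list of file numbers and ranges separated by blanks or
commas, as in the examples in the text, and it preserves the order given.

```python
def expand_files(spec):
    """Expand a specification such as '1 8-10' into a list of file numbers."""
    numbers = []
    for item in spec.replace(",", " ").split():
        if "-" in item:
            # A range such as 8-10 stands for every file from 8 through 10.
            first, last = (int(part) for part in item.split("-"))
            numbers.extend(range(first, last + 1))
        else:
            numbers.append(int(item))
    return numbers


print(expand_files("1 8-10"))      # [1, 8, 9, 10]
print(expand_files("1,2,9,4,3"))   # [1, 2, 9, 4, 3]
```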

Example  of  Concatenating  Files  on  More  Than  One  Reel 


Example: The data are contained on more than one reel, but the tapes are not
concatenated. This type of file requires extra programming effort by a user
when accessing the tapes. Concatenating the tapes when copying will save
programming time later.

JCL to concatenate tapes:

//JOB CARD
//STEP1 EXEC IEBGENER
//OUT   DD DSN=NEW.DATA.FILE,DISP=(NEW,KEEP),
//         UNIT=HIGH9,VOL=SER=000000,LABEL=(1,SL),
//         DCB=(RECFM=FB,LRECL=2016,BLKSIZE=24192,DEN=4)
//IN    DD DSN=OLD.DATA1,VOL=SER=111111,LABEL=(1,SL),UNIT=HIGH9,DISP=SHR
//      DD DSN=OLD.DATA2,VOL=SER=222222,LABEL=(1,SL),UNIT=AFF=IN,DISP=SHR
//      DD DSN=OLD.DATA3,VOL=SER=333333,LABEL=(1,SL),UNIT=AFF=IN,DISP=SHR

Example  of  Changing  Data  Tape  Format 

Example: A tape has two extra characters (in this case blanks) at the end
of the last record in each block. In this example the record format was 'F'.
It was necessary to change the record format to 'U' instead of 'F' or the last
block would not copy because it was a short block. If the extra characters were
not removed when copying, the project would have to program around them
each time the tapes are used.

//JOB CARD
//STEP1 EXEC SAS
//TAPEIN  DD DSN=OLD.DATA,DISP=(OLD,KEEP),
//  UNIT=TAPE9,VOL=SER=000000,LABEL=(,BLP),
//  DCB=(RECFM=U,LRECL=2872,BLKSIZE=2872)
//TAPEOUT DD DSN=NEW.DATA,DISP=(NEW,KEEP),
//  UNIT=HIGH9,VOL=SER=111111,LABEL=(1,SL),DCB=DEN=4
DATA _NULL_;
INFILE TAPEIN START=S LENGTH=L;
INPUT;
S=1; L=2870;
FILE TAPEOUT LRECL=410 BLKSIZE=28700;
PUT _INFILE_;
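
The same reformatting can be expressed outside SAS. The Python sketch below is
a hypothetical illustration, not the program used at Rand: it assumes the tape
image is available as a byte string, that every block is a full 2872 bytes
long, and that the last two bytes of each block are the unwanted padding. It
keeps the first 2870 bytes of each block and cuts them into 410-byte records.

```python
def strip_block_padding(data, blksize=2872, pad=2, lrecl=410):
    """Drop `pad` trailing bytes from each block and return the records."""
    usable = blksize - pad
    if usable % lrecl != 0:
        raise ValueError("usable block length is not a multiple of LRECL")
    records = []
    for start in range(0, len(data), blksize):
        block = data[start:start + blksize][:usable]
        # Cut the cleaned block into fixed-length logical records.
        for offset in range(0, len(block), lrecl):
            records.append(block[offset:offset + lrecl])
    return records


# Two demonstration blocks: 7 records of 410 bytes followed by 2 blanks.
block = b"".join(bytes([65 + i]) * 410 for i in range(7)) + b"  "
records = strip_block_padding(block * 2)
print(len(records), len(records[0]))
```

The records returned can then be written out in blocks of any convenient size,
such as the 28700-byte blocks (70 records) used in the job above.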


"Finding"  Programs 


The new archivist should always be on the lookout for programs to use in the
archive. Following is a short PL/I program "found" at Rand and used by the
archive staff as needed. This nice little program was written to "clean" the
end of a tape. Sometimes we think we can write one more data set to the end
of a tape only to find out we have a short reel and only part of a data file
was written. Rather than have a tape with partial data we use this program to
write a new tape mark, wiping out the partial data set. To use this program
it is necessary to know the next file number (label number) after the last
trailer label for a complete data file (in this example (4,blp)).


//D0004R01 JOB (2082,50,393),'MCGEE',CLASS=S
//* Author - Jerry Hull, The Rand Corporation
// EXEC PLIOCLG
//PLI.SYSIN DD *
 EOF: PROCEDURE OPTIONS(MAIN);
   DCL SCR FILE;
   OPEN FILE(SCR) OUTPUT;
   CLOSE FILE(SCR);
 END;
/*
//GO.SCR DD DCB=(RECFM=FB,LRECL=80,BLKSIZE=80),
// DISP=(OLD,KEEP),DSNAME=DUMMY,LABEL=(4,BLP),
//* YOU MUST USE BLP IN ORDER FOR THE FILE TO BE COMPLETELY WIPED OUT.
// UNIT=HIGH9,VOL=SER=008607


Data  Service  in  a  Computer  Center 

-  continued  from  page   14 

also often leads to use by individuals,
agencies, and private firms of our
contract programming, an activity that has
expanded to the point that extra staffing
is needed. One of our goals for
the future is to be able to expand
staff to handle more of these
assignments.

Although  a  great  deal  has  been  done 
with  the  Census  at  Rutgers,  more  Census 
activity  is  planned.  We  intend  to 
develop  mapping  techniques  using  Census 
data and streamline the Census programming
methods currently in use. Because
of  the  tremendous  interest  in  Census 
data,  plans  are  being  made  to  increase 
the  support  to  microcomputer  users  who 
wish to analyze Census data. The complexity
of these files makes subsetting
time-consuming, but it is an aspect of
our  work  increasingly  in  demand.  All 
of  these  activities  need  the  publicity 
necessary  to  acquaint  the  University 
community  with  them.  Although  our  many 
workshops,  seminars,  and  publications 
are  widely  distributed  at  Rutgers, 
still  greater  efforts  will  be  made  in 
the  future. 


Care  and  Feeding  of  Magnetic  Tapes 

-continued  from  page  11 

Computer Magnetic Tape File Properties.
Federal Information
Processing Standards Publication
53. 1978.

(4) Magnetic Tape Management-A Guide
to Achieving Reliable Performance
from Your Computer Tapes-Including
6250 BPI. Computer-Link
Corp., 40 Ray Avenue, Burlington,
MA 01803 (617) 272-7400. 1982.

For  general  reading: 

Davis, William S. Information Processing
Systems. Second Edition.
Philippines. Addison-Wesley,
1981. Chapter 10.

Like the traditional library, the
data archive performs many different
functions to meet the needs of users.
These functions include data acquisition
and cleaning, development of
conventions and standards for description
of the data, data processing and
analysis, dissemination of information
about the data, storage and maintenance
of data tapes, development of an
inventory system and inventory controls
as well as a data retrieval system,
diffusion of the data, training
for archive users and program development.

Data  Acquisition 

Obtaining  new  materials  for  archive 
holdings  from  some  continuing  sources 
of  supply  requires  establishment  of 
both  formal  and  informal  arrangements 
with  institutions,  departments  or 
bureaus  that  produce  data  on  a  regular 
basis  in  order  to  obtain  some  or  all 
of  their  productions.  It  is  also 
necessary  to  establish  priorities  for 
the  kind  of  data  to  be  acquired. 
Since the cost of processing and maintaining
a data set is often greater
than  the  cost  of  acquisition,  selection 
must  be  made  with  great  care.  Ephemeral 
or  frequently  replicated  data  sets 
should  be  acquired  only  when  there 
is  a  concrete  need  for  them  since 
there  is  a  high  probability  of  being 
able to obtain popular data sets elsewhere
if a local need develops. The
cost  of  acquiring,  cleaning,  indexing, 
and  maintaining  a  data  set  should  be 
considered  in  relation  to: 

(a)  the  likelihood  of  there  being 
multiple  users; 

(b)  the  possibility  of  acquiring  at  a 
later  date  if  the  need  should 
arise; 


(c)  its  availability  at  a  reasonable 
cost  and  with  little  delay  from 
some  other  source; 

(d)  the  amount  of  overlap  with  the 
existing  collection; 

(e)  the  intrinsic  significance  of  the 
data. 

The  form  in  which  data  arrives  varies 
from  supplier  to  supplier  and  from  study 
to  study.  This  can  result  in  a  great 
amount  of  time  being  spent  figuring  out 
just  what  it  is  you  have  received.  The 
ultimate  answer  is  to  have  funding 
sources  or  institutions  conducting  the 
survey  require  that  arrangements  for 
archiving  be  made  prior  to  the  actual 
funding  or  conducting  of  the  research. 
This way, the archive can be involved
from the beginning and provide guidelines
and standards for researchers.
A formal way to handle this would be
the development of an institutional
policy on minimal standards for data
to be turned over to the archive. A
policy statement of this type would
ensure that the datasets turned over
to the archive meet the criterion of
methodological adequacy. There have been
cases in which an archive decided to
pass up a study of great substantive
interest because it appeared to have been
done in such a poor manner, utilizing
such sloppy and shoddy techniques of
data gathering or documentation, that
despite the interest of the subject
matter the data set was not worth
acquiring. General archive operating
policy should include a list of criteria
which studies should meet if the
data set is to be considered acceptable.
At a very minimum the following
documentation should be available for
each study:


(a)  complete  and  accurate  codebook  or 
description  of  the  data  structure; 

(b)  description  of  the  data  format; 

(c)  illustrations  of  structure  and 
format; 

(d)  total  size  of  data  set; 

(e)  complete  and  accurate  description 
of  the  organization  of  the  files 
for  the  medium  in  which  the  data 
is  stored; 

(f)  precise  definition  for  each  data 
element; 

(g)  complete  explanation  of  all  codes 
used; 

(h)  sample  of  documents  used  in  data 
gathering; 

(i) description of sampling procedures
employed, with intended and resultant
sample size;

(j) summary of training provided
fieldworkers and coders;

(k) description of data collection
procedures;

(l) name and current address of study
director.

Data  Cleaning 

In  its  most  simplified  form,  data 
cleaning  involves  processes  aimed  at 
placing  data  into  a  format  that  is 
easily  handled  by  computers.  These 
processes  include  identifying  and 
correcting possible discrepancies between
the actual format of the data
and the descriptions of that format.
Many archives employ specialized staff
members who do this type of data cleaning.
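
One way such a check might be automated is sketched below in Python. The
layout table and the function name check_record are hypothetical; the sketch
assumes fixed-length character records and a codebook that gives, for each
variable, its starting column, its width, and its set of valid codes.

```python
# Hypothetical codebook: variable -> (start column from 1, width, valid codes).
LAYOUT = {
    "SEX":    (1, 1, {"1", "2", "9"}),
    "REGION": (2, 2, {"01", "02", "03", "04"}),
}
LRECL = 3  # documented record length


def check_record(record, layout=LAYOUT, lrecl=LRECL):
    """Return a list of discrepancies between one record and the codebook."""
    problems = []
    if len(record) != lrecl:
        problems.append(f"length {len(record)} != documented {lrecl}")
    for name, (start, width, valid) in layout.items():
        value = record[start - 1:start - 1 + width]
        if value not in valid:
            problems.append(f"{name}: undocumented code {value!r}")
    return problems


for rec in ["103", "305", "2"]:
    print(rec, check_record(rec))
```

A record that produces an empty list agrees with the documentation; anything
else is a candidate for correction or for a note in the codebook.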

Development  of  Conventions  &  Standards 


In  order  to  facilitate  the  process 
of  utilizing  data  initially  prepared  by 
others it is necessary to establish conventions
for coding and standards for
describing the data themselves in order
to:

(a) permit combining information from
different collections in some reasonable
way;

(b) combine samples from different
studies in order to increase
the number of cases;

(c)  make  comparisons  among  data  sets; 

(d)  facilitate  later  analysis. 

Data  Processing  and  Analysis 

The  data  processing  and  analysis 
function  of  the  archive  provides  for 
the  manipulation  of  data  for  the  user's 
purpose. This may entail the reformatting
of data for use at the user's
local facility or providing a specially
prepared subset of cases or variables
rather than a simple copy. Some users
may  need  a  frequency  distribution  for 
the  variables  (if  not  provided  in  the 
codebook)  or  simple  cross-tabulations. 
Other  users  may  need  more  detailed 
statistical  analyses. 

Dissemination  of  Documentation 

The most important documentation produced
by the archive is the codebook
describing the dataset. Archive staff
also prepare abstracts of data sets for
inclusion in a catalog and for advertising
purposes. Production of some
type of archive catalog is almost mandatory
since it provides not only an
in-house listing of current holdings
but is also the best way for a user to
browse the contents of the archive.
Advertising the availability of the
data can take many forms. Some
archives prepare and distribute their
own newsletter announcing new acquisitions
while others include a special
data announcement section in an existing
institution newsletter. Archives
should also strive to maintain a collection
of published material related
to the data sets in order to provide
examples of how the data have been
analyzed already and clarify ambiguities
in the interpretation of the data.

Storage and Maintenance

Internal procedures must be established
to identify the current
storage location of all materials.
Magnetic tapes must be stored in a
controlled temperature environment
and protected from magnetic flux and
physical shock. They must be recopied
on a periodic basis in order to assure
their continued utility and, where usage
is heavy, to protect against deterioration
due to machine-induced wear.
(See Patricia Reslcok's article for a
detailed discussion of tapes.)

Inventory 


As with any collection, it is necessary
that the archive maintain a
catalog or index of holdings. The
archivist might consider maintaining
a "public" catalog and an annotated
"private" catalog with additional information.
The "private" catalog would
include  abstracts  of  studies  added  to 
the  collection  since  the  last  published 
catalog  update.  It  is  also  necessary 
to  develop  an  internal  inventory  system 
to  keep  track  of  the  current  status  of 
all  studies  in  the  archive  including 
studies  "on  order"  or  being  processed. 
Other  internal  inventory  materials  would 
include  a  catalog  of  tapes  by  tape  or 
storage  number  and  a  catalog  of  studies 
by  study  number. 

Retrieval 


Requests from users for access to
data relating to their particular topic
of interest often require the archive
to search not only its own holdings but
those of other archives as well and,
where necessary, to obtain from other
archives those materials required to
serve the needs of the user. For this
reason it is advisable for the archive
to maintain a collection of catalogs
from other archives and to become familiar
with the general class of holdings
at other archives. Most archivists
find it helpful to maintain personal
contact with other archivists through
the network established by professional
associations (like IASSIST) in order to
facilitate the exchange of information
about data holdings and to keep abreast
of technological developments in this
field.


Diffusion 

The data archive specializes in copying
its own collection and making it
available to the user at his convenience,
in the form most suitable to his purposes.
However, the archive must still
maintain control over access to its
materials in accordance with any wishes
of the original donor. For this purpose
the archivist usually develops a
form letter which the user signs agreeing
to archive terms. Once the data
have been copied, the archivist fills
out a standard form to send along with
the data tape which describes the files
on the tape (number of files, logical
record length, blocksize, number of
records) as well as general tape characteristics
(tracks, density, format,
character set, and internal labeling).
Shipping data on magnetic tape requires
that the tape be adequately packaged to
prevent damage in transit and labeled
on the outside of the package as being
a MAGNETIC TAPE with a warning to keep
the package away from magnets and
electric motors which could destroy
the data set stored on the tape. Tapes
sent through the mails are often insured
for the cost of recopying the data and
have a return receipt included with
mailing.
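
The standard form described above is essentially a small record structure.
The Python sketch below shows one hypothetical way to hold and print it; the
class names, field names and sample values are assumptions made for
illustration and do not reproduce any particular archive's form.

```python
from dataclasses import dataclass


@dataclass
class TapeFile:
    """One file on the tape, as described on the form."""
    number: int
    lrecl: int      # logical record length in bytes
    blksize: int    # block size in bytes
    records: int    # number of records


@dataclass
class TapeForm:
    """General tape characteristics plus the list of files."""
    tracks: int
    density_bpi: int
    record_format: str
    character_set: str
    labels: str
    files: list

    def render(self):
        lines = [
            f"Files: {len(self.files)}  Tracks: {self.tracks}  "
            f"Density: {self.density_bpi} BPI  Format: {self.record_format}  "
            f"Character set: {self.character_set}  Labels: {self.labels}"
        ]
        for f in self.files:
            lines.append(
                f"File {f.number}: LRECL={f.lrecl} BLKSIZE={f.blksize} "
                f"RECORDS={f.records}"
            )
        return "\n".join(lines)


form = TapeForm(9, 6250, "FB", "EBCDIC", "SL", [TapeFile(1, 2016, 24192, 32760)])
print(form.render())
```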

Training 

Archives can perform a training function
by teaching users how to make a
query; where to make a query so that the
appropriate data can be obtained; how to
utilize the data once it is obtained;
the devices available for processing;
the strategies to be employed for analysis;
and the kinds of interpretations
that can be made from such analysis.

The  archivist  may  wish  to  prepare  a 
user  manual  for  distribution  to  potential 
users  documenting  how  their  archive  is 
organized  and  including  information  on 
archive  services  and  locally  available 
-continued on page 29

Historical Background


Baruch College, originally the Business
School of the City College of New
York (CCNY), has been, since 1968, one of the
eight  senior  colleges  in  the  City 
University  of  New  York  (CUNY).  Although 
particularly  strong  in  the  field  of 
business,  most  academic  fields  are 
represented  in  its  curriculum  with  the 
departments  organized  into  three  schools: 
Business,  Liberal  Arts  and  Education. 
The  college  awards  business  and  liberal 
arts  undergraduate  degrees  as  well  as 
the  MBA,  several  other  master's  degrees 
and  a  Ph.D.  in  business.  Increased 
interest  in  data  resources  on  campus 
has paralleled a new emphasis on computerization.
There had always been
some  use  of  secondary  data  on  campus. 
Several  important  machine-readable 
data  files  were  already  available, 
scattered  through  several  different 
departments,  unorganized  and  with  little 
documentation  or  information  available 
outside  the  particular  departments 
which  possessed  the  files.  Use  was 
limited  to  those  few  who  knew  the  files 
existed  and  who  knew  how  to  access  them. 
Under  the  leadership  of  Professor 
Thomas  V.  Atkins,  Deputy  Chairman  for 
Library  Instruction  Services,  the 
library was able to effectively convince
the college administration that
data  files  should  be  conceived  of  as 
basically  an  information  resource  and, 
as  such,  the  college  library  was  the 
natural  place  for  a  data  service. 

In the Spring of 1981, a Data
Archives Service (DAS) was established
as part of the library's information
services. The library administration
added not only a data library but created
an educational program whose function
was to inform and instruct the
Baruch community about the use of
secondary data as an information
resource. Because of its instructional
orientation DAS was made part of the
Library Instruction Services Division
and was intended to complement services
already provided by groups on campus
such as the Educational Computer Center,
the Statistics Lab, etc. Memberships in
ICPSR and the Roper Center were begun
immediately. Data from sources other
than ICPSR or Roper was purchased on an
extremely selective basis due to budget
restrictions. A reference collection
of manuals and data catalogs was set up
for use with the growing tape collection,
for the first time centralizing to some
extent the documentation for both mainframe
software and secondary data. The
training programs were begun, on a
limited basis, almost immediately.

Almost  at  once,  Baruch  College  began 
to  lobby  for  a  University-wide  ICPSR 
membership.  There  was  some  precedent 
for  this  since  there  had  been  a  previous 
City  University  ICPSR  membership  which 
had lapsed due to administrative problems.
Baruch's efforts joined those
of a significant number of faculty and
administrators at other CUNY colleges
who had long been interested in seeing
a return of the university-wide membership.
Finally, in July of 1983, these
combined efforts were successful and the
senior colleges of the City University
of New York became a federated member
of ICPSR, one of the largest federations
in the Consortium. To begin with, the
federation  included  only  the  four-year 
institutions.   It  is  assumed  that  the 
two-year  colleges  will  join  at  a  later 
time  if  there  is  sufficient  interest. 
To encourage the success of the federation,
Baruch's administration, both
college and library, willingly accepted
the college's appointment as coordinator
for the new membership.

Funding 


As is often the case with academic
institutions, there was little additional
funding available. When the
service began it operated on the proverbial
"shoestring" with support from
the library budget and using library
personnel. Some additional financial
assistance came from the Title III grant
awarded to Dr. Atkins for the development
of a Graduate Business Resource
and Study Center. Sufficient money
for computer use and tape storage was
allocated from the general research
funds of the Baruch College Educational
Computer Center.

When  Baruch  became  the  coordinator 
of  the  City  University  ICPSR  membership, 
the  University  Chancellor's  Office  paid 
the  Federation's  membership  fee.  Half- 
time  services  of  Baruch's  data  archivist, 
additional  student  assistance,  and  some 
money  for  non-personnel  expenses  such  as 
documentation,  magnetic  tapes,  supplies, 
software  and  travel  funds  were  funded 
by additional support from the Chancellor's
Office and members of the ICPSR
Federation.  Since  Baruch  contributed 
its  facilities  and  the  services  of  its 
already  established  data  library,  it 
was  not  required  to  contribute  further 
funds.  University  support  is  limited 
to  expenditures  associated  with  the 
ICPSR  Federation  while  Baruch  uses  its 
own  funds  for  purchase  of  non-ICPSR 
data,  special  equipment  and  its  own 
data  services. 

Staffing 


Baruch's data archivist is assigned
part-time to the Baruch service and
part-time to the CUNY Center, which is
also staffed by a part-time graduate
assistant and undergraduate student
assistants. The archivist, a trained
librarian, set up the data library,
organized the tape collection, developed
the documentation collection of appropriate
codebooks and manuals, and established
the research consultation
service. Once the service was organized,
basic maintenance of the tape and reference
collections was assumed by the
graduate assistant, who also provides
assistance with the development of
educational programs. As the service
expands, it is expected that additional
graduate assistance will be needed.
Undergraduate students assist with clerical
duties, as does the secretarial
staff of the Library Instruction Division.
The staffing is based on an
assumption that computer and statistical
consultation is available from
Baruch College's Educational Computer
Center or, in the case of other CUNY
users, at their home campuses.

Equipment 

The City University has a large central
computer installation (CUNY/UCC)
used by all the colleges as their main
facility. The hardware at the UCC
includes an IBM 3081, an IBM 3033, and
an Amdahl 470/V6-II. In addition to
the mainframes, there are high speed
printers, graphics equipment and software
installations of most of the major
statistical packages. Almost all the
senior colleges, including Baruch, have
supplementary equipment including mainframes,
minis and microcomputer labs.
CDA stores copies of its tapes at the
CUNY UCC on a permanent basis so that
they are easily available to all campuses.
For convenience Baruch facilities
are often used for small printing
jobs since the CUNY UCC is located some
50 blocks to the northwest.

The Center itself has had only a
Decwriter 300 baud printing terminal for
maintenance of its tape collection and
development of on-line demonstrations
for seminars. Recently a Volker-Craig
CRT and an IBM PC XT were received.
This equipment will be used for present
activities of the Center in addition to
development of instruction in and
assistance with microcomputer data
analysis, an area in which the CDA
intends to specialize. For training
seminars which include on-line demonstrations
of secondary data resources,
the Center has had access to the Baruch
College Graduate Business Study and
Resource Center seminar room which is
equipped with several Decwriters for
workshop participants and an Electrohome
Projector which projects an
enlarged image from a video terminal.

Physical  Environment 


If  the  Data  Archives  Service  at 
Baruch,  or  for  that  matter,  the  CUNY 
Data  Service  had  waited  for  proper 
housing,  it  would  not  exist  today. 
Space  is  at  a  premium  almost  everywhere 
in  the  University,  but  nowhere  more 
than at Baruch College. The data
library was begun in one of the faculty
offices of the Library Instruction Division
which at that time housed three
other faculty members in addition to
the part-time data archivist. At this
writing, a separate room serves as
office space for the archivist, as a
workroom for maintenance of the data
collection, as a storage room for the
documentation collection, and as consultation
space. There is some additional
storage space for the master tape
copies. Presently, plans are being made
to acquire additional space which will
serve as a workroom for the students in
maintaining the tape collection and the
data archives files. The original
office will then be freed for use as
office space and for data consultations.
On  the  desiderata  list  is  a  terminal 
room  for  users  which  will  encourage  data 
use  in  the  data  library  and  allow  for 
on-line  data  consultations. 

Dissemination  of  Information 


When  the  Baruch  Data  Archives  Search 
Service  became  the  coordinator  for  the
CUNY  ICPSR  membership  it  expanded  upon 
its  own  primary  emphasis,  the  education
of  faculty  and  graduate  students,  to
bringing  about  a  general  awareness  of 
the  potential  of  secondary  data  analysis 
and  ICPSR  files  in  particular.  A  monthly 
annotated  list  of  ICPSR  data  available 
to  all  CUNY  faculty  is  mailed  to  the 
campus  liaisons,  the  libraries  and  speci-
fic data  users.  At  Baruch,  the  CUNY
list  is  supplemented  with  a  listing  of 
datasets  available  only  to  Baruch
faculty  and  students.  A  Baruch  data 
directory  is  in  preparation  which  will 
list  and  index  by  subject  all  data 
available  on  campus  including  non- 
bibliographic  databases  accessed  through 
Computer  Search  Services.  These  general 
listings  are  supplemented  by  data  bibli- 
ographies on  specific  topics  which  are 
prepared  for  seminars  and  then  mailed 
on  request.  The  semi-annual  newsletter 
published  by  the  Baruch  Graduate  Busi- 
ness Study  and  Resource  Center  and 
mailed  to  all  departments  has  carried 
a  section  on  the  Baruch  Data  Archives 
since  its  inception.  But  timely  infor- 
mation and  publicity  for  the  entire  CUNY 
community  remains  a  difficult  problem 
because  staff  is  limited  and  the  commun- 
ity to  be  served  is  large,  disparate  and 
separated  by  sizeable  distances. 

For  Baruch  and  CUNY  the  most  important 
methods  of  reaching  out  to  both  new  and 
sophisticated  data  users  have  been  the
training  program  and  the  seminar  series. 
Each  seminar  focuses  on  the  information 
resources  for  the  study  and  teaching  of 
a  particular  inter-disciplinary  topic. 
Each  seminar  includes  some  mention  of 
bibliographic  databases  available  on  the 
topic  as  well  as  a  brief  overview  of 
information  sources  in  print.  The
greatest  portion  of  the  two  hours,  however,
is  devoted  to  machine-readable  data 
files,  particularly  those  from  ICPSR. 
Where  appropriate,  government  data  files 
or  other  data  archives  which  specialize 
in  the  topic  are  mentioned.  An  on-line, 
interactive  demonstration  of  the  con- 
tents of  an  important  dataset  in  the 
field  concludes  each  seminar.  Topics 
covered  have  been  Urban  Problems,  Women, 
Youth,  and  Consumer  Behavior.  Plans 
for  the  future  include  seminars  in  this
format  on  the  national  elections,  health
care,  education,  marketing,  etc.  The 
seminars  held  at  Baruch  have  been  sup- 
plemented by  visits  to  the  individual 
college  campuses  with  a  "What  is  ICPSR" 
format  also  including  on-line  demon- 
strations. 

Future  Developments


At  this  point  it  is  expected  that 
the  CUNY  ICPSR  Federation  will  con- 
tinue as  an  integral  part  of  the  uni- 
versity's information  resources  and 
Baruch  will  maintain  its  own  data  ser- 
vice as  well.  Plans  for  the  future  for 
both  services  are  inextricably  linked. 
First  on  the  agenda  is  the  solution  to 
the  problem  of  reaching  the  huge  CUNY 
community.  In  some  part  this  will  be 
done  by  renewed  emphasis  on  successful 
programs,  but  additional  services  will 
be  offered  as  finances  and  personnel 
permit. 

We  intend  to  continue  and  increase 
training  in  the  availability  of  data 
resources  and  how  they  are  used  for 
research  and  teaching.  These  seminars, 
we  believe,  assist  in  motivating  faculty 
to  enlarge  their  uses  of  data  and  secon- 
dary analysis  both  in  research  and  in 
instruction.  Both  the  interdisciplin- 
ary seminars  held  at  Baruch  and  the 
ones  held  at  the  individual  colleges 
will  be  increased.  In  addition, 
"hands-on"  workshops  actually  using 
data  will  be  held  in  the  specially 
equipped  Baruch  on-line  classrooms. 
Not  only  will  our  training  increase,
but  so  will  our  services.  We  intend  to
facilitate  faculty  and  student  use  of 
microcomputers  for  data  analysis  by 
purchasing  or  preparing  subsets  on 
diskette  and  supporting  microcomputer 
statistical  packages.  Special  workshops 
are  planned,  for  example,  on  the  use
of  ABC,  ICPSR's  instructional  statisti- 
cal package.  As  a  part  of  this  effort, 
special  instructional  packages  will  be 
developed  similar  to  teaching  packages 
prepared  by  the  Library  Instruction 
Division  for  bibliographic  instruction. 
These  are  intended  for  use  in  our 
course-related  lecture  series.


Our  informational  services  will  ex- 
pand this  year  with  a  new  data  direc- 
tory containing  tape  access  information, 
a  subject  index  and  enlarged  annota-
tions.  For  Baruch  this  will  tie  to-
gether all  campus  files,  while  the  CUNY
edition  will  list  ICPSR  data  available. 
This  directory  is  a  preliminary  step 
in  our  long-term  goal  of  having  an 
on-line  dataset  catalog.  Publication 
of  our  newsletter  on  a  regular  basis 
and  special  publications  on  major  data- 
sets  are  also  in  our  plans  for  future 
development. 


DATA  ARCHIVE  MANAGEMENT 
statistical  packages  used  for  analy-
sis. Another  useful  manual  would  be
an  internal  document  describing  the 
operational  procedures  of  the  archive 
for  use  in  training  new  staff. 

Program  Development 

Archives  can  play  a  role  in  the 
development  of  new  programs,  particu- 
larly in  the  collection  and  creation 
of  specialized  data  and  by  encourag- 
ing the  creation  of  new  computing 
power. 

The  author  wishes  to  thank  David 
Nasatir,  whose  article   "Operational 
Considerations  of  Archives"  in  Howard 
White's   Reader  in  Machine-Readable 
Social  Data  served  as  the  basis  for 
both  the  workshop  presentation  and 
this  article. 